Conversation
@laxmareddyp (Collaborator) commented Oct 8, 2025

Description of the change

Checkpoints for the model have been uploaded to Kaggle, and the presets are now registered for use.

  • I have added all the necessary unit tests for my change.
  • I have verified that my change does not break existing code and works with all backends (TensorFlow, JAX, and PyTorch).
  • My PR is based on the latest changes of the main branch (if unsure, rebase the code).
  • I have followed the Keras Hub Model contribution guidelines in making these changes.
  • I have followed the Keras Hub API design guidelines in making these changes.
  • I have signed the Contributor License Agreement.

gemini-code-assist bot (Contributor) commented:

Summary of Changes

Hello @laxmareddyp, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates the Qwen3_MoE model into Keras Hub by adding and registering its presets. It introduces two new files: __init__.py for registration and qwen3_moe_presets.py which defines configurations and Kaggle checkpoint handles for two Qwen3_MoE models (30b_a3b_en and 235b_a22b_en). The intent is to make these pre-trained models readily available for Keras users.

Highlights

  • Intent: This pull request aims to integrate the Qwen3_MoE model presets into Keras Hub. The primary goal is to register configurations for two Qwen3_MoE models, making their pre-trained checkpoints, which have been uploaded to Kaggle, accessible for use within the Keras ecosystem.
  • Changes Summary: The PR introduces two new files under keras_hub/src/models/qwen3_moe/:
  1. __init__.py: This file serves as the entry point for the Qwen3_MoE model within Keras Hub. It imports the Qwen3MoeBackbone and the newly defined backbone_presets, then registers these presets using the register_presets utility function.
  2. qwen3_moe_presets.py: This file defines backbone_presets, a dictionary containing detailed configurations for two Qwen3_MoE models:
    • qwen3_moe_30b_a3b_en: A Mixture-of-Experts (MoE) model with 30.5 billion total parameters (3.3 billion activated).
    • qwen3_moe_235b_a22b_en: A larger MoE model with 235 billion total parameters (22 billion activated).
      Each preset includes metadata such as a description, total parameter count, and a kaggle_handle pointing to its respective checkpoint on Kaggle. This effectively adds the necessary metadata and registration logic for these models.
  • Reviewer Activity Summary: No reviewer activity has been recorded for this pull request yet.
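Based on the summary above, the two new files might look roughly like the following self-contained sketch. The Kaggle handles, parameter counts, and the simplified register_presets function below are illustrative placeholders, not the actual contents of the PR:

```python
# Hypothetical sketch of qwen3_moe_presets.py: a dict mapping each
# preset name to its metadata and a Kaggle checkpoint handle. The
# handle paths below are placeholders, not the real PR values.
backbone_presets = {
    "qwen3_moe_30b_a3b_en": {
        "metadata": {
            "description": (
                "Mixture-of-Experts model with 30.5 billion total "
                "parameters and 3.3 billion activated parameters."
            ),
            "params": 30_500_000_000,
        },
        "kaggle_handle": "kaggle://placeholder/qwen3_moe_30b_a3b_en",
    },
    "qwen3_moe_235b_a22b_en": {
        "metadata": {
            "description": (
                "Larger MoE model with 235 billion total parameters "
                "and 22 billion activated parameters."
            ),
            "params": 235_000_000_000,
        },
        "kaggle_handle": "kaggle://placeholder/qwen3_moe_235b_a22b_en",
    },
}

# Simplified stand-in for the register_presets utility that
# __init__.py would call: it records each preset under the owning
# backbone class name so from_preset-style lookups can find it.
PRESET_REGISTRY: dict = {}

def register_presets(presets: dict, cls_name: str) -> None:
    """Record each preset under the owning backbone class name."""
    for name, config in presets.items():
        PRESET_REGISTRY[(cls_name, name)] = config

register_presets(backbone_presets, "Qwen3MoeBackbone")
```

In the real library, registration makes the presets resolvable by name, so users never touch the Kaggle handle directly.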
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Supported commands:

  • Code Review (/gemini review): Performs a code review for the current pull request in its current state.
  • Pull Request Summary (/gemini summary): Provides a summary of the current pull request in its current state.
  • Comment (@gemini-code-assist): Responds in comments when explicitly tagged, both in pull request comments and review comments.
  • Help (/gemini help): Displays a list of available commands.

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
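For example, a custom style guide could be added like this. The .gemini/ folder name comes from the paragraph above, but the styleguide.md file name and its contents are assumptions for illustration; consult the official documentation for the actual file names and schema:

```shell
# Hypothetical example: add a custom review style guide for Gemini
# Code Assist. The .gemini/ folder is named in the docs above; the
# styleguide.md file name here is an assumption.
mkdir -p .gemini
cat > .gemini/styleguide.md <<'EOF'
# Review style guide
- Follow PEP-8 for all Python code.
- Preset names must be lowercase snake_case and end with a locale
  suffix such as `_en`.
EOF
```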

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist bot (Contributor) left a comment


Code Review

Thank you for adding the Qwen3-MoE presets. I've found a couple of minor issues in the preset definitions that should be addressed to ensure they are correct and clear for users. My review is based on the Keras Hub Model contribution guidelines and API design guidelines, focusing on correctness and documentation clarity as outlined in the repository's style guide (e.g., lines 335-339, 366-371).

laxmareddyp and others added 3 commits October 8, 2025 16:30
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@laxmareddyp laxmareddyp added the kokoro:force-run Runs Tests on GPU label Oct 9, 2025
@kokoro-team kokoro-team removed the kokoro:force-run Runs Tests on GPU label Oct 9, 2025
@laxmareddyp (Collaborator, Author) commented:

The JAX GPU test failures are related to a timeout. Thanks

@sachinprasadhs sachinprasadhs merged commit 6f72208 into keras-team:master Oct 9, 2025
10 of 11 checks passed
@laxmareddyp laxmareddyp deleted the laxma_qwen3_checkpoints branch October 9, 2025 19:54